 fail fast


Fail Fast, or Ask: Mitigating the Deficiencies of Reasoning LLMs with Human-in-the-Loop Systems Engineering

arXiv.org Artificial Intelligence

State-of-the-art reasoning LLMs are powerful problem solvers, but they still occasionally make mistakes. However, adopting AI models in risk-sensitive domains often requires error rates near 0%. To address this gap, we propose collaboration between a reasoning model and a human expert who resolves queries the model cannot confidently answer. We find that quantifying the uncertainty of a reasoning model through the length of its reasoning trace yields an effective basis for deferral to a human, e.g., cutting the error rate of Qwen3 235B-A22B on difficult MATH problems from 3% to less than 1% when deferring 7.5% of queries. However, the high latency of reasoning models still makes them challenging to deploy on use cases with high query volume. To address this challenge, we explore fronting a reasoning model with a large non-reasoning model. We call this modified human-in-the-loop system "Fail Fast, or Ask", since the non-reasoning model may defer difficult queries to the human expert directly ("failing fast"), without incurring the reasoning model's higher latency. We show that this approach yields around 40% latency reduction and about 50% cost savings for DeepSeek R1 while maintaining 90+% area under the accuracy-rejection curve. However, we observe that latency savings are lower than expected because of "latency drag", the phenomenon that processing easier queries with a non-reasoning model pushes the reasoning model's latency distribution towards longer latencies. Broadly, our results suggest that the deficiencies of state-of-the-art reasoning models -- nontrivial error rates and high latency -- can be substantially mitigated through black-box systems engineering, without requiring access to LLM internals.
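The routing described in this abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how the non-reasoning model decides to defer, so the confidence score, the three threshold values, and the names `Thresholds`, `route`, `fast_model`, and `reasoning_model` are all hypothetical. The one element taken from the abstract is that the reasoning model's uncertainty is approximated by the length of its reasoning trace.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Thresholds:
    # All three values are hypothetical placeholders, not numbers from the paper.
    answer_conf: float = 0.9      # fast model answers directly at or above this confidence
    fail_fast_conf: float = 0.2   # below this, skip the reasoning model and ask the human
    max_trace_tokens: int = 4000  # longer reasoning traces are treated as uncertain


def route(
    query: str,
    fast_model: Callable[[str], Tuple[str, float]],
    reasoning_model: Callable[[str], Tuple[str, int]],
    t: Thresholds = Thresholds(),
) -> Tuple[str, Optional[str]]:
    """Return (who_answers, answer); answer is None when a human must answer."""
    answer, conf = fast_model(query)
    if conf >= t.answer_conf:
        return ("fast_model", answer)
    if conf < t.fail_fast_conf:
        # "Fail fast": defer to the human without paying reasoning latency.
        return ("human", None)

    # Otherwise escalate to the slower reasoning model.
    answer, trace_tokens = reasoning_model(query)
    if trace_tokens > t.max_trace_tokens:
        # A long reasoning trace is used as a proxy for uncertainty, so ask the human.
        return ("human", None)
    return ("reasoning_model", answer)
```

With stub models, `route("2+2?", lambda q: ("4", 0.95), lambda q: ("4", 100))` returns `("fast_model", "4")`; lowering the stub's confidence or lengthening the stub's trace moves the query to the reasoning model or to the human.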


How to launch--and scale--a successful AI pilot project

#artificialintelligence

At the US Patent & Trademark Office in Alexandria, Virginia, artificial intelligence (AI) projects are expediting the patent classification process, helping detect fraud, and expanding examiners' searches for similar patents, enabling them to search through more documents in the same amount of time. And every project started with a pilot project. "Proofs of concept (PoCs) are a key approach we use to learn about new technologies, test business value assumptions, de-risk scale project delivery, and inform full production implementation decisions," says USPTO CIO Jamie Holcombe. Once the pilot proves out, he says, the next step is to determine if it can scale. Indian e-commerce vendor Flipkart has followed a similar process before deploying projects that allow for text and visual search through millions of items for customers who speak 11 different languages.


Fail Fast, Learn Faster: Lessons in Data-Driven Leadership in an Age of Disruption, Big Data, and AI: 9781119806226: Business Development Books @ Amazon.com

#artificialintelligence

The lessons of this book are delivered in clear language and without technical jargon. I've worked with Randy Bean for almost twenty years, and I've read a lot of his writing. He prides himself on his ability to communicate about technical subjects to people with no technical backgrounds. If you are someone in a business role who has heard about such topics as big data, artificial intelligence, and digitization, and you want to know what all the fuss is about without getting lost in technical detail, you have come to the right place.


For successful AI projects, celebrate your graveyard and be prepared to fail fast – TechCrunch

#artificialintelligence

AI teams invest a lot of rigor in defining new project guidelines. But the same is not true for killing existing projects. In the absence of clear guidelines, teams let infeasible projects drag on for months. They put up a dog and pony show during project review meetings for fear of becoming the messengers of bad news. By streamlining the process to fail fast on infeasible projects, teams can significantly increase their overall success with AI initiatives.


Council Post: Covid-19 Has Accelerated Digital Transformation -- With AI Playing A Key Role

#artificialintelligence

Long before the Covid-19 pandemic, businesses had been on a steady path toward digital transformation to achieve vast improvements in worker productivity, public health and safety, quality of products, services and customer experiences and even to obtain a sustainable planet and a circular economy. The benefits of the next digital era seem almost endless, but the challenges of adopting the technologies that will enable it to transpire -- AI, machine learning and deep learning at the edge (where rapid automation takes place) -- have made businesses pause because they force great behavioral and structural changes, like new business models, operating procedures, worker skill sets and mindsets. It can even affect cultures. These have been some of the biggest stumbling blocks to reaching the next digital era -- until recently. With our livelihoods at risk, the pandemic has served as a wake-up call to expedite the timeline for digital transformation exponentially.


Why 'Fail Fast' Is a Disaster When It Comes to Artificial Intelligence

#artificialintelligence

"Fail fast" is a well-known phrase in the startup scene. The spirit of failing fast is getting to market with a minimum viable product and then rapidly iterating toward success. Failing fast acknowledges that entrepreneurs are unlikely to design a successful end-state solution before testing it with real customers and real consequences. This is the "ready, fire, aim" approach. Or, if the blowback is big enough, it's the "ready, fire, pivot" approach.


How Will AI Disrupt Pre-Digital Businesses? - Disruption Hub

#artificialintelligence

We all know that Google, Facebook and Amazon are investing heavily in AI, with results widely reported. It's no surprise that businesses built around collecting and analysing data are leading the way in AI. But what about businesses built in the pre-digital age, based on physical products and infrastructure: businesses that underpin transport and energy, or that develop new medicines or materials? How should these businesses take advantage of AI?


Want to Win in the Age of A.I., Automation, and Algorithms? Follow 3 Simple Rules

#artificialintelligence

About six weeks ago I received an email from somebody called Amy Ingram. It was a friendly, professional email to schedule a meeting with the CEO of an exciting new start-up I was writing about for my Future Proof column in Inc. Not Who I Thought She Was. After a couple of email exchanges the meeting was confirmed and I thanked Amy for her time. When I got to meet with the CEO in person later that week, he looked at me with a glint in his eye and asked, in a rather curious tone, "What did you think of Amy Ingram?"


Chatbot Best Practices in Contact Centers – Jim Rembach – Medium

#artificialintelligence

The rate of change in contact centers is accelerating. Fueled by the worldwide proliferation of mobile devices, more interactions are pointed to your contact center than ever before. To survive, you must learn about chatbot best practices in contact centers NOW. Dr. Yi Zhang, a world-renowned thought leader on artificial intelligence (AI) and professor at the University of California at Santa Cruz, has worked with organizations like Alibaba, HP, Toyota, Ex Libris, and the University of Pittsburgh Medical Center on AI. In an interview, Dr. Zhang explains the difference between the various generations of chatbot technology found in contact centers, on websites, or in an app experience.


If you're not using big data, you're about to fail fast

#artificialintelligence

'Data is not about insights, it's about generating money,' says Rubikloud's chief product officer. Ever since the financial crash of 2008, businesses around the world have struggled to grow at the same levels they once enjoyed – but big data and machine learning could help turn things around, and help companies reconnect with customers. "The reality now is this plateaued, zero-growth type of world, where you're in the 0.5-1 per cent [range]," Ayoub said, speaking at WIRED Retail 2016. "There's a lot of companies that do more small scale retail that are seeing enormous amounts of growth but in general, this is the kind of climate we're living in." Ayoub isn't concerned so much with why this is happening, but rather how retailers are reacting to the change. One way businesses have tried to keep their costs down is by consolidating, then leveraging greater purchasing power to buy stock at lower prices. "Big fish buy big fish, then bigger fish buy them," said Ayoub. "That allows them to pressure vendors to push down prices.